# Matryoshka Retriever

This is an implementation of the [Supabase](https://supabase.com/) blog post
["Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval"](https://supabase.com/blog/matryoshka-embeddings).

![Matryoshka Retriever](/img/adaptive_retrieval.png)

### Overview

This class performs "Adaptive Retrieval" for searching text embeddings efficiently using the
Matryoshka Representation Learning (MRL) technique. It retrieves documents similar to a query
embedding in two steps:

- **First-pass**: Uses a lower dimensional sub-vector from the MRL embedding for an initial, fast,
  but less accurate search.

- **Second-pass**: Re-ranks the top results from the first pass using the full, high-dimensional
  embedding for higher accuracy.

This code demonstrates using MRL embeddings for efficient vector search by combining faster,
lower-dimensional initial search with accurate, high-dimensional re-ranking.

## Example

### Setup

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install @langchain/openai @langchain/community
```

To follow the example below, you need an OpenAI API key:

```bash
export OPENAI_API_KEY=your-api-key
```

We'll also be using `chroma` for our vector store. Follow the instructions [here](/docs/integrations/vectorstores/chroma) to setup.

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/retrievers/matryoshka_retriever.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>

:::note
Due to the constraints of some vector stores, the large embedding metadata field is stringified (`JSON.stringify`) before being stored. This means that the metadata field will need to be parsed (`JSON.parse`) when retrieved from the vector store.
:::
